Session C1

Federated Learning and Reinforcement Learning

Conference
1:30 PM — 2:50 PM HKT
Local
Dec 1 Tue, 9:30 PM — 10:50 PM PST

Incentive Mechanism Design for Federated Learning: A Two-stage Stackelberg Game Approach

Guiliang Xiao, Mingjun Xiao, Guoju Gao, Sheng Zhang, Hui Zhao and Xiang Zou

Federated Learning (FL) is a newly emerging distributed ML model, where a server coordinates multiple workers to cooperatively train a learning model using their private datasets, while ensuring that these datasets are not revealed to others. In this paper, we focus on incentive mechanism design for FL systems. Taking the incentives into consideration, we first design two utility functions, one for the server and one for the workers. Then, we model the corresponding utility optimization problem as a two-stage Stackelberg game by treating the server as the leader and the workers as followers. Next, we derive an optimal equilibrium solution for both stages of the game. Based on this solution, we design an incentive mechanism that ensures that the server achieves the optimal utility, while stimulating workers to do their best to train the ML model. Finally, we conduct extensive simulations to demonstrate the performance of the proposed mechanism.

Time Efficient Federated Learning with Semi-asynchronous Communication

Jiangshan Hao, Yanchao Zhao and Jiale Zhang

With the explosive growth of massive data generated by smart Internet of Things (IoT) devices, federated learning has been envisioned as a promising technique to provide distributed machine learning services while protecting training data privacy. However, conventional federated learning protocols have shown significant drawbacks with regard to efficiency and scalability. First, because federated learning uses a synchronous communication model and the computation capability of each device differs, straggling users can severely degrade efficiency. Second, in synchronous communication, there is no effective client selection mechanism to make the model perform better in the early stage. Third, how to coordinate the communication of various nodes to accelerate global convergence is also an issue that needs to be considered. To solve the above-mentioned problems, we propose a semi-asynchronous federated learning mechanism where a data expansion method is used to effectively reduce the stragglers existing in both synchronous and asynchronous communication models. Moreover, we also design a priority function to make the accuracy increase rapidly in the early stage. Experimental results demonstrate that our proposed method has higher accuracy and faster

Multi-agent Fault-tolerant Reinforcement Learning with Noisy Environments

Canhui Luo, Xuan Liu, Xinning Chen and Juan Luo

Multi-agent reinforcement learning systems are used to solve problems in which agents achieve specific goals by learning policies through interaction with the environment. Almost all existing multi-agent reinforcement learning methods assume that the observations of the agents are accurate during the training process. They do not take into account that observations may be wrong due to the complexity of the actual environment or the existence of dishonest agents, which makes it difficult for agent training to succeed. In this paper, considering the limitations of the traditional multi-agent algorithm framework in noisy environments, we propose a multi-agent fault-tolerant reinforcement learning (MAFTRL) algorithm. Our main idea is to establish each agent's own error detection mechanism and design the information communication medium between agents. The error detection mechanism is based on an autoencoder, which calculates the credibility of each agent's observation and effectively reduces the environmental noise. The communication medium, based on the attention mechanism, can significantly improve the ability of agents to extract effective information. Experimental results show that our approach accurately detects erroneous observations of the agents, and that it has good performance and strong robustness in both the traditional reliable environment and the noisy environment. Moreover, MAFTRL significantly outperforms the traditional methods in the noisy environment.

Decentralized Exploration of a Structured Environment Based on Multi-agent Deep Reinforcement Learning

Dingjie He, Dawei Feng, Hongda Jia and Hui Liu

Multi-robot environment exploration is one of the widely discussed topics in the field of robotics. It is the foundation for many real-world robotic applications. Many decentralized methods (that is, methods without a centralized controller) have been proposed in the past decades. Most of them focus on improving collaboration efficiency by utilizing low-level heuristic information, such as distances to obstacles and robot positions. In contrast, when a human being makes decisions on a similar task, he or she exploits high-level knowledge, such as the building's common structure pattern. This paper proposes a novel distributed multi-robot exploration algorithm based on deep reinforcement learning (DME-DRL) for structured environments that enables robots to make decisions on the basis of this high-level knowledge. DME-DRL is a distributed algorithm that uses deep neural networks to extract the structural pattern of the environment, and it can work in scenarios with or without communication. The experimental results show that this approach can decrease the travel distance by approximately 10.84% on average compared with traditional heuristic methods, and can significantly reduce the communication cost in the exploration process.

Session Chair

Dawei Feng (National University of Defense Technology)

Session C2

Crowd Sourcing and Parallel Acceleration

Conference
1:30 PM — 2:50 PM HKT
Local
Dec 1 Tue, 9:30 PM — 10:50 PM PST

D2D-Enabled Reliable Data Collection for Mobile Crowd Sensing

Pengfei Wang, Zhen Yu, Chi Lin, Leyou Yang, Yaqing Hou and Qiang Zhang

With the increasingly powerful sensing capacities of mobile devices, Mobile Crowd Sensing (MCS) systems need to collect larger volumes of sensing data from participants. Nevertheless, collecting such a large volume of data is costly for participants, base stations and the MCS server. Even worse, some sensing data cannot satisfy the MCS sensing requirement due to low quality and are filtered out by the MCS server in the cloud. Inspired by the D2D technique in 5G networks, where mobile devices can communicate directly with the help of a nearby base station, we propose in this paper the Reliable Data Collection (RDC) algorithm to validate the generated sensing data on the device side. Specifically, the whole process is formulated as a Probability problem of Discovering Reliable sensing data (PDR) on the client side, and Expectation Maximization (EM) is leveraged to devise the algorithm. Finally, extensive simulations and a real-world use case are conducted to evaluate the performance of the RDC algorithm, and the results show that RDC outperforms the other two benchmarks in estimation accuracy and in saving data collection cost.

Improving the Applicability of Visual Peer-to-Peer Navigation with Crowdsourcing

Erqun Dong, Jianzhe Liang, Zeyu Wang, Jingao Xu, Longfei Shangguan, Qiang Ma and Zheng Yang

Visual peer-to-peer navigation is a suitable solution for indoor navigation because it relieves the labor of site surveys and eliminates infrastructure dependence. However, a major drawback hampers its application: the peer-to-peer mode suffers from a deficiency of paths in large indoor scenarios with multifarious places-of-interest. We propose a visual peer-to-peer navigation system with a crowdsourcing scheme that addresses this drawback by merging the paths of different leaders into a global map. To realize the idea, we further deal with the challenges it entails, namely the unidirectional disadvantage, the scale ambiguity, and large computational overhead. We design a navigation strategy to solve the unidirectional problem and turn to VIO to tackle scale ambiguity. We devise a mobile-edge architecture to enable real-time navigation (30 fps, 100 ms end-to-end delay) and lighten the burden on smartphones (35% battery life for 2h35min) while assuring the accuracy of localization and map construction. Through experimental validations, we show that P2P navigation, previously relying on the abundance of independent paths, can enjoy a sufficiency of navigation paths with a crowdsourced global map. The experiments demonstrate a navigation success rate of 100% and a spatial offset of less than 3.2 m, better than existing works.

Massively Parallel Causal Inference of Whole Brain Dynamics at Single Neuron Resolution

Wassapon Watanakeesuntorn, Keichi Takahashi, Kohei Ichikawa, Joseph Park, George Sugihara, Ryousei Takano, Jason Haga and Gerald M. Pao

Empirical Dynamic Modeling (EDM) is a nonlinear time series causal inference framework. The latest implementation of EDM, cppEDM, has only been used for small datasets due to its computational cost. With the growth of data collection capabilities, there is a great need to identify causal relationships in large datasets. We present mpEDM, a parallel distributed implementation of EDM optimized for modern GPU-centric supercomputers. We improve the original algorithm to reduce redundant computation and optimize the implementation to fully utilize hardware resources such as GPUs and SIMD units. As a use case, we run mpEDM on AI Bridging Cloud Infrastructure (ABCI) using datasets of an entire animal brain sampled at single neuron resolution to identify dynamical causation patterns across the brain. mpEDM is 1,530× faster than cppEDM, and a dataset containing 101,729 neurons was analyzed in 199 seconds on 512 nodes. This is the largest EDM causal inference achieved to date.

An Effective Design to Improve the Efficiency of DPUs on FPGA

Yutian Lei, Qingyong Deng, Saiqin Long, Shaohui Liu and Sangyoon Oh

Convolutional neural networks (CNNs) have been widely used in various complicated problems, such as image classification, object detection, and semantic segmentation. To meet diversified CNN structures, the deep learning processing unit (DPU) is designed as a general accelerator on field programmable gate arrays (FPGAs) to support various CNN layers, such as convolution, pooling, and activation. However, low DPU utilization and scheduling efficiency appear when the DPU is used for multitask applications implemented with CNN models. In this paper, an effective design including multi-core with different sizes (MCDS) and DPU Plus is proposed to improve the efficiency of DPU usage in the two dimensions of time and space. By increasing the number of DPU cores on an FPGA and the utilization of a single DPU core, the MCDS design can effectively improve the overall throughput with restricted on-chip resources. Furthermore, the DPU Plus design is proposed to improve the scheduling efficiency of DPUs by implementing the DPU together with other significant auxiliary modules of the application system on the same FPGA. Finally, a color space conversion module is implemented alongside the DPU cores to test its performance, and the experiment shows that, compared with running entirely on the CPU, it achieves 16.2x acceleration and increases the throughput of the entire system by 3.0x.

Session Chair

Shigeng Zhang (Central South University)

Session C3

Localization and Cross-technology Communication

Conference
2:50 PM — 4:10 PM HKT
Local
Dec 1 Tue, 10:50 PM — 12:10 AM PST

Lightweight Mobile Devices Indoor Location Based on Image Database

Ran Gao, Yanchao Zhao and Maoxing Tang

Among the numerous indoor localization technologies, image-based solutions have great advantages in their convenient access from smartphones and their infrastructure-less deployment. However, image-based localization also suffers from two key disadvantages, which hinder its universal application. First, it requires a large amount of computing and storage resources, which is difficult to achieve on a mobile device, while a cloud-based scheme incurs unacceptable delay. Second, although this solution does not require infrastructure, it still suffers from labor-intensive image database construction and updates. To overcome these limitations, we propose an image-based indoor localization method featuring real-time localization and labor-less image database updates. This method mainly innovates in two aspects. First, we propose a mobile-device-compatible image database compression framework, which enables real-time and accurate on-device image searching even in a large scenario. Our localization method achieves resource efficiency (in terms of storage and processing) by keeping only image feature vectors and employing an efficient k-means tree to search for the best matched image. Second, to achieve labor-less image database updating, we mainly add high-quality and informative query images to the database. These query images can compensate for missing information or changed scenes in an up-to-date manner. We conduct real experiments on the Android platform to verify the feasibility and performance of the localization method. Experimental results show that our method has good accuracy (90% of location errors are within 1.5 m) and high real-time performance (the average location delay is less than 0.5 s).

A Dynamic Escape Route Planning Method for Indoor Multi-floor Buildings Based on Real-time Fire Situation Awareness

Chun Wang, Juan Luo, Cuijun Zhang and Xuan Liu

The complicated interior structure of high-rise buildings brings great difficulties for fire escape route planning. Existing two-dimensional (2D) emergency evacuation models are utilized to solve the problem of guidance and rescue for fire responders. However, these models face a bottleneck of low security due to limited environmental information and no consideration of the behavior features of trapped personnel. In this paper, we propose DERP, a dynamic escape route planning method that achieves accurate disaster site avoidance and safe route planning by considering fire situation awareness in a smart building. DERP is enabled by two novel designs. First, a three-dimensional (3D) fire information model is constructed by cellular automata, considering the overall situation of the indoor 3D topological structure, the fire situation and the crowd distribution. Second, a multiple-constraint 3D indoor emergency escape route planning algorithm is designed based on a 3D path safety function. The experimental results show that DERP can plan and adjust the escape route in a timely and dynamic manner, thus increasing the escape probability of the trapped people.

Mitigating Cross-Technology Interference in Heterogeneous Wireless Networks based on Deep Learning

Weidong Zheng, Junmei Yao and Kaishun Wu

With the prosperity of the Internet of Things, a large number of heterogeneous wireless devices share the same unlicensed spectrum, leading to severe cross-technology interference (CTI). In particular, the transmission power asymmetry of heterogeneous devices further aggravates this problem, preventing low-power devices from transmitting data and starving them. This paper proposes an enhanced CCA (E-CCA) mechanism to mitigate CTI, so as to improve the performance and fairness among heterogeneous networks. E-CCA contains a signal identification design based on deep learning to identify the signal type within a tolerable time duration; it also contains a CCA adaptive mechanism based on the signal type to avoid CTI. As a result, ZigBee devices can compete for the channel with WiFi devices more fairly, and the network performance can be improved accordingly. We set up a testbed based on TelosB, a commercial ZigBee platform, and USRP N210, which can be used as the WiFi platform. With the signals collected through USRP N210, over 99.9% signal identification accuracy can be achieved even when the signal duration is tens of microseconds. Simulation results based on NS-3 show that E-CCA can increase the ZigBee performance dramatically with little throughput degradation for WiFi.

Accelerating PageRank in Shared-Memory for Efficient Social Network Graph Analytics

Baofu Huang, Zhidan Liu and Kaishun Wu

PageRank has wide applications in online social networks and serves as an important benchmark to examine graph processing frameworks. Many efforts have been made to improve the computation efficiency of PageRank on shared-memory platforms, where a single machine can be sufficiently powerful to handle a large-scale graph. Existing methods, however, still suffer from synchronization issues and irregular memory accesses, which deteriorate their overall performance. In this paper, we present an accelerated parallel PageRank computation approach, named APPR. By investigating the characteristics of parallel PageRank computation and the degree distributions of social network graphs, APPR proposes a series of optimization techniques to improve the efficiency of PageRank computation. Specifically, a destination-centric graph partitioning scheme is designed to avoid synchronization issues when concurrently updating the common vertex data. By exploiting the power-law structure of social network graphs, APPR can intelligently schedule the computations of vertices to save computing operations. The vertex messages are adjusted by APPR for transmission to further improve the locality of memory accesses. Empirical evaluations are performed based on a set of large real-world graphs. Experimental results show that APPR significantly outperforms the state-of-the-art methods, with on average 2.4x speedup in execution time and 16.4x reduction in DRAM communication.
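
As a rough illustration of why a destination-centric layout avoids write conflicts, the sketch below runs PageRank in pull style: each destination vertex gathers contributions from its in-neighbors and writes only its own slot. This is not APPR itself; the function name `pagerank_pull`, the adjacency-list input, and the default damping factor of 0.85 are assumptions made for the example, and the sketch is sequential.

```python
def pagerank_pull(out_edges, damping=0.85, iters=50):
    """Pull-style PageRank on an adjacency list (out_edges[u] = list of targets).

    Illustrative only: the abstract does not specify APPR's data layout.
    """
    n = len(out_edges)
    # Build in-neighbor lists so each destination can gather its own inputs.
    in_edges = [[] for _ in range(n)]
    for src, dsts in enumerate(out_edges):
        for dst in dsts:
            in_edges[dst].append(src)
    out_deg = [len(dsts) for dsts in out_edges]
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by vertices without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in range(n) if out_deg[v] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        # Each destination reads the old ranks and writes a single entry,
        # so the per-vertex work could be split across threads without locks.
        rank = [
            base + damping * sum(rank[u] / out_deg[u] for u in in_edges[v])
            for v in range(n)
        ]
    return rank
```

Because every iteration only reads the previous rank vector and each vertex owns exactly one output entry, partitioning the vertices by destination gives each worker a disjoint write set, which is the property the abstract attributes to destination-centric partitioning.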

Session Chair

Zhidan Liu (Shenzhen University)

Session C4

Scheduling in Edge Environment

Conference
2:50 PM — 4:10 PM HKT
Local
Dec 1 Tue, 10:50 PM — 12:10 AM PST

Using Configuration Semantic Features and Machine Learning Algorithms to Predict Build Result in Cloud-Based Container Environment

Yiwen Wu, Yang Zhang, Bo Ding, Tao Wang and Huaimin Wang

Container technologies are widely used in large-scale production cloud environments, and Docker has become the de facto industry standard among them. In practice, Docker builds often break, and a large amount of effort is put into troubleshooting broken builds. Prior studies have evaluated the rate at which builds in large organizations fail. However, there is still a lack of early warning methods for predicting the Docker build result before the build starts. This paper makes a first attempt at an automatic method, named PDBR. It uses configuration semantic features extracted via AST together with machine learning algorithms to predict the build result in a cloud-based container environment. The evaluation experiments, based on more than 36,000 collected Docker builds, show that PDBR achieves 73.45%-91.92% in F1 and 29.72%-72.16% in AUC. We also demonstrate that different ML classifiers have significant and large effects on the AUC performance of PDBR.

Joint Service Placement and Computation Offloading in Mobile Edge Computing: An Auction-based Approach

Lei Zhang, Zhihao Qu, Baoliu Ye and Bin Tang

Emerging applications, e.g., virtual reality, online games, and the Internet of Vehicles, have computation-intensive and latency-sensitive requirements. Mobile edge computing (MEC) is a powerful paradigm that significantly improves the quality of service (QoS) of these applications by offloading computation and deploying services at the network edge. Existing works on service placement in MEC usually ignore the impact of the different QoS requirements among service providers (SPs), which are common in many applications; for example, online games require extremely low latency and online video requires extremely large bandwidth. Considering the competitive relationship among SPs, we propose an auction-based resource allocation mechanism. We formulate the problem as a social welfare maximization problem to maximize the effectiveness of allocated resources while maintaining economic robustness. According to our theoretical analysis, this problem is NP-hard, and thus it is practically impossible to derive the optimal solution. To tackle this, we design a multiple rounds of iterative auctions mechanism (MRIAM), which divides resources into blocks and allocates them through multiple rounds of auctions. Finally, we conduct extensive experiments and demonstrate that our auction-based mechanism is effective in resource allocation and economically robust.

Multi-user Edge-assisted Video Analytics Task Offloading Game based on Deep Reinforcement Learning

Yu Chen, Sheng Zhang, Mingjun Xiao, Zhuzhong Qian, Jie Wu and Sanglu Lu

With the development of deep learning, artificial intelligence applications and services have boomed in recent years, including recommendation systems, personal assistants and video analytics. Similar to other services in the edge computing environment, artificial intelligence computing tasks are pushed to the network edge. In this paper, we consider the multi-user edge-assisted video analytics task offloading (MEVAO) problem, where users have video analytics tasks with various accuracy requirements. All users independently choose their accuracy decisions, satisfying the accuracy requirement, and offload the video data to the edge server. With a utility function designed based on the features of video analytics, we model MEVAO as a game theory problem and achieve the Nash equilibrium. For flexibility in making accuracy decisions under different circumstances, a deep reinforcement learning approach is applied to our problem. In extensive simulations, our proposed design performs much better than several other approaches.

Accelerating Deep Learning Tasks with Optimized GPU-assisted Image Decoding

Lipeng Wang, Qiong Luo and Shengen Yan

In computer vision deep learning (DL) tasks, most of the input image datasets are stored in the JPEG format. These JPEG datasets need to be decoded before DL tasks are performed on them. We observe two problems in the current JPEG decoding procedures for DL tasks: (1) the decoding of image entropy data in the decoder is performed sequentially, and this sequential decoding repeats with the DL iterations, which takes significant time; (2) current parallel decoding methods under-utilize the massive hardware threads on GPUs. To reduce the image decoding time, we introduce a pre-scan mechanism to avoid the repeated image scanning in DL tasks. Our pre-scan generates boundary markers for entropy data so that the decoding can be performed in parallel. To cooperate with the existing dataset storage and caching systems, we propose two modes of the pre-scan mechanism: a compatible mode and a fast mode. The compatible mode does not change the image file structure so pre-scanned files can be stored back to disk for subsequent DL tasks. In comparison, the fast mode crafts a JPEG image into a binary format suitable for parallel decoding, which can be processed directly on the GPU. Since the GPU has thousands of hardware threads, we propose a fine-grained parallel decoding method on the pre-scanned dataset. The fine-grained parallelism utilizes the GPU effectively, and achieves speedups of around 1.5× over existing GPU-assisted image decoding libraries on real-world DL tasks.

Session Chair

Qingyong Deng (Xiangtan University)

Session C5

Application and Security

Conference
2:50 PM — 4:10 PM HKT
Local
Dec 1 Tue, 10:50 PM — 12:10 AM PST

Scheduling Rechargeable UAVs for Long Time Barrier Coverage

Zhouqing Han, Xiaojun Zhu and Lijie Xu

We consider barrier coverage applications where a set of UAVs is deployed to monitor whether intruders pass through a line, i.e., the barrier. Due to the limited energy supply of UAVs, a charging pile is used to recharge them. The problem is to place UAVs on top of the barrier and schedule them to the charging pile such that the barrier is seamlessly covered and the total number of UAVs is minimized. We decompose the problem into subproblems by dividing the barrier into disjoint subsegments and covering each subsegment independently. We prove a theoretical lower bound on the minimum number of UAVs required to cover the barrier forever. We then propose two scheduling strategies. In the first strategy, only fully recharged backup UAVs are scheduled to take over from UAVs running out of energy. If there are enough UAVs, this strategy can cover the barrier forever. The second strategy is proposed to deal with the situation in which the number of UAVs is insufficient. Under this strategy, if no backup UAV is fully recharged, the one with the most battery energy is selected to take over the monitoring task. When the number of UAVs is insufficient, the barrier can still be covered for a long time. We analytically derive the number of UAVs required by the first strategy, and the monitoring duration of the second strategy in the case of insufficient UAVs. Simulations verify the effectiveness of the proposed solutions.

Use of Genetic Programming Operators in Data Replication and Fault Tolerance

Syed Mohtashim Abbas Bokhari and Oliver Theel

Distributed systems are needed today to balance workloads, since providing highly accessible data objects is of utmost importance. Faults hinder the availability of the data, thereby leading systems to fail. In this regard, data replication in distributed systems is a means to mask failures and mitigate any such hindrances to the availability of the data. This replicated behavior is controlled by data replication strategies, but there are numerous scenarios reflecting different trade-offs between several quality metrics. This demands new replication strategies optimized for the given scenarios, which may otherwise be left unaddressed. This research, therefore, uses an automatic mechanism based on genetic programming to construct new, optimized replication strategies that were previously unknown. This mechanism uses a so-called voting structure of directed acyclic graphs (each representing a computer program) as a unified representation of replication strategies. These structures are interpreted by our general algorithm at run-time in order to derive the respective quorums used to manage replicated objects. To this end, the research demonstrates the usefulness of various genetic operators through instances of them, exploiting the heterogeneity between existing strategies and thereby flexibly creating innovative strategies. This mechanism creates new hybrid strategies and evolves them over several generations to optimize them while maintaining the consistency (validity) of the solutions. Our approach is effective and flexible: it offers competitive results with respect to contemporary strategies and generates novel strategies even with only slight use of the relevant genetic operators.

Multiple Balance Subsets Stacking for Imbalanced Healthcare Datasets

Yachao Shao, Tao Zhao, Xiaoning Wang, Xiaofeng Zou and Xiaoming Fu

Accurate prediction is highly important for clinical decision making and early treatment. In this paper, we study the imbalanced data problem in prediction, a key challenge in the healthcare area. Imbalanced datasets bias classifiers towards the majority class, leading to unsatisfactory classification performance on the minority class, which is known as the imbalance problem. Existing imbalance learning methods may suffer from issues like information loss, overfitting, and high training time cost. To tackle these issues, we propose a novel ensemble learning method called Multiple bAlance Subsets Stacking (MASS), which exploits a multiple balance subsets construction strategy. Furthermore, we improve MASS by introducing parallelism (Parallel MASS) to reduce the training time cost. We evaluate MASS on three real-world healthcare datasets, and experimental results demonstrate that it outperforms the state-of-the-art methods in terms of AUC, F1-score and MCC. The speedup analysis shows that Parallel MASS greatly reduces the training time cost on large datasets, and that its speedup increases as the data size grows.
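
The abstract does not describe how the balanced subsets are built, so the following is only one plausible sketch of the general idea: partition the majority class into chunks of roughly the minority size and pair every chunk with all minority samples, so no majority sample is discarded. The function name `balanced_subsets`, its parameters, and the partitioning rule are assumptions for illustration, not the MASS procedure; the stacking step is omitted.

```python
import random


def balanced_subsets(labels, minority_label, seed=0):
    """Return index lists, each pairing all minority samples with one majority chunk.

    Hypothetical construction; the paper's actual strategy may differ.
    """
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority or not majority:
        raise ValueError("need at least one sample of each class")
    rng = random.Random(seed)
    rng.shuffle(majority)  # Random partition of the majority class.
    size = len(minority)
    subsets = []
    for start in range(0, len(majority), size):
        # The final chunk may be smaller than the minority class.
        chunk = majority[start:start + size]
        subsets.append(sorted(minority + chunk))
    return subsets
```

Each returned index list could then be used to train one base classifier, with a meta-learner combining their outputs. Since the subsets are independent of one another, the base classifiers could also be trained in parallel.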

Securing App Behaviors in Smart Home: A Human-App Interaction Perspective

Jinlei Li, Yan Meng, Lu Zhou and Haojin Zhu

The smart home has become a mainstream lifestyle due to the maturity of IoT platforms and the popularity of smart devices. While offering great convenience and entertainment, smart homes suffer from malicious attacks that inject improper commands and actions into home devices, which may breach the user's safety and privacy. Traditional solutions mainly focus on generating security policies that rely on app analysis to constrain apps' behaviors. However, these policies lack the flexibility to adapt to the highly dynamic smart home system. We need to consider not only app behaviors but also user behaviors to enforce an appropriate security policy. In this study, we propose WIPOLICY, a cross-layer security enforcement system for smart homes that monitors the behaviors of both apps and users. The key novelty of WIPOLICY is incorporating user activity recognition via physical-layer wireless signals into the definition and enforcement of security policies to constrain app behavior. We implement WIPOLICY on the Samsung SmartThings platform with 187 SmartApps, and 24 behavior policies are defined and enforced. The case study demonstrates the effectiveness of WIPOLICY in thwarting app misbehavior.

Session Chair

Pengfei Wang (Dalian University of Technology)
